Overview

Dataset statistics

Number of variables: 9
Number of observations: 1991
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 140.1 KiB
Average record size in memory: 72.1 B
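
The average record size follows from the total memory footprint and the number of observations. A minimal sketch of that arithmetic, assuming the report uses 1 KiB = 1024 bytes and that the rounded figures above are close enough to reproduce the value (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 140.1   # "Total size in memory", in KiB (rounded in the report)
n_observations = 1991    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```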

Variable types

Numeric: 9

Alerts

do is highly correlated with wqi [High correlation]
tc is highly correlated with wqi [High correlation]
wqi is highly correlated with do and 1 other field [High correlation]
do is highly correlated with wqi [High correlation]
wqi is highly correlated with do [High correlation]
Unnamed: 0 is highly correlated with station [High correlation]
station is highly correlated with Unnamed: 0 [High correlation]
do is highly correlated with wqi [High correlation]
bod is highly correlated with tc [High correlation]
tc is highly correlated with bod and 1 other field [High correlation]
wqi is highly correlated with do and 1 other field [High correlation]
ph is highly skewed (γ1 = 27.34385201) [Skewed]
tc is highly skewed (γ1 = 31.71407152) [Skewed]
Unnamed: 0 is uniformly distributed [Uniform]
Unnamed: 0 has unique values [Unique]
na has 41 (2.1%) zeros [Zeros]
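
The skewness alerts report γ1, the sample skewness of a column. The sketch below computes the adjusted Fisher-Pearson estimator, which to my understanding is the bias-corrected form pandas uses for `Series.skew`. The threshold of 20 is an assumption about the report's configuration, and `sample_skewness` and `SKEWNESS_THRESHOLD` are illustrative names, not part of any library.

```python
import math

SKEWNESS_THRESHOLD = 20  # assumed alert threshold, not taken from this report


def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (bias-corrected)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least 3 observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # 2nd central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # 3rd central moment
    if m2 == 0:
        return 0.0  # constant column: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


def is_highly_skewed(values, threshold=SKEWNESS_THRESHOLD):
    """True when the absolute skewness exceeds the (assumed) threshold."""
    return abs(sample_skewness(values)) > threshold
```

A column such as ph, whose median is 7.3 but whose maximum is 67115, gets a large positive γ1 because a handful of extreme values dominate the third moment.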

Reproduction

Analysis started: 2022-01-23 17:02:31.334425
Analysis finished: 2022-01-23 17:03:08.153828
Duration: 36.82 seconds
Software version: pandas-profiling v3.1.1
Download configuration: config.json
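
The duration is the difference between the two timestamps. A small sketch with the standard library, assuming both timestamps share the same clock and time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the reproduction details above.
started = datetime.strptime("2022-01-23 17:02:31.334425", FMT)
finished = datetime.strptime("2022-01-23 17:03:08.153828", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```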

Variables

Unnamed: 0
Real number (ℝ≥0)

HIGH CORRELATION
UNIFORM
UNIQUE

Distinct: 1991
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 995
Minimum: 0
Maximum: 1990
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 99.5
Q1: 497.5
Median: 995
Q3: 1492.5
95-th percentile: 1890.5
Maximum: 1990
Range: 1990
Interquartile range (IQR): 995
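
Because Unnamed: 0 is simply 0, 1, ..., 1990, its quantiles can be checked by hand. The sketch below uses linear interpolation between neighbouring order statistics, which I believe is the pandas default for `quantile`. The helper name `quantile` is illustrative.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)        # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1991))                    # 0, 1, ..., 1990, already sorted

q05 = quantile(values, 0.05)                  # about 99.5
q1 = quantile(values, 0.25)                   # 497.5
q3 = quantile(values, 0.75)                   # 1492.5
q95 = quantile(values, 0.95)                  # about 1890.5
iqr = q3 - q1                                 # interquartile range
value_range = values[-1] - values[0]          # maximum minus minimum
```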

Descriptive statistics

Standard deviation: 574.8965124
Coefficient of variation (CV): 0.5777854396
Kurtosis: -1.2
Mean: 995
Median Absolute Deviation (MAD): 498
Skewness: 0
Sum: 1981045
Variance: 330506
Monotonicity: Strictly increasing
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value                Count  Frequency (%)
1989                 1      0.1%
1334                 1      0.1%
1308                 1      0.1%
1310                 1      0.1%
1312                 1      0.1%
1314                 1      0.1%
1316                 1      0.1%
1318                 1      0.1%
1320                 1      0.1%
1322                 1      0.1%
Other values (1981)  1981   99.5%

Smallest values
Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%

Largest values
Value  Count  Frequency (%)
1990   1      0.1%
1989   1      0.1%
1988   1      0.1%
1987   1      0.1%
1986   1      0.1%
1985   1      0.1%
1984   1      0.1%
1983   1      0.1%
1982   1      0.1%
1981   1      0.1%
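
The descriptive statistics reported for Unnamed: 0 can be reproduced directly, since the column is the row index 0..1990. The sketch assumes the report uses the sample variance (divisor n - 1) and defines MAD as the median of absolute deviations from the median, which is what the reported values suggest.

```python
import math

n = 1991
values = list(range(n))                       # 0, 1, ..., 1990

total = sum(values)                           # "Sum"
mean = total / n                              # "Mean"
variance = sum((v - mean) ** 2 for v in values) / (n - 1)   # sample variance
std = math.sqrt(variance)                     # "Standard deviation"
cv = std / mean                               # "Coefficient of variation (CV)"

# n is odd, so the median is the middle element of the sorted data.
median = sorted(values)[n // 2]
mad = sorted(abs(v - median) for v in values)[n // 2]       # "MAD"

# "Monotonicity: Strictly increasing"
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))
```

A strictly increasing, unique, uniformly spread integer column like this one is usually a leftover row index from an earlier export rather than a measured variable.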

station
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 320
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1947.967855
Minimum: 2
Maximum: 3473
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1092
Q1: 1456
Median: 1861
Q3: 2336.5
95-th percentile: 3360
Maximum: 3473
Range: 3471
Interquartile range (IQR): 880.5

Descriptive statistics

Standard deviation: 722.1032617
Coefficient of variation (CV): 0.3706956764
Kurtosis: 0.2392620535
Mean: 1947.967855
Median Absolute Deviation (MAD): 437
Skewness: 0.03799323932
Sum: 3878404
Variance: 521433.1206
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
1861                129    6.5%
1400                10     0.5%
1399                10     0.5%
1572                10     0.5%
1547                10     0.5%
1094                10     0.5%
1570                10     0.5%
1151                10     0.5%
1573                10     0.5%
1642                10     0.5%
Other values (310)  1772   89.0%

Smallest values
Value  Count  Frequency (%)
2      1      0.1%
17     10     0.5%
18     10     0.5%
20     10     0.5%
21     10     0.5%
42     10     0.5%
43     10     0.5%
1023   9      0.5%
1024   9      0.5%
1025   9      0.5%

Largest values
Value  Count  Frequency (%)
3473   3      0.2%
3471   3      0.2%
3468   3      0.2%
3466   3      0.2%
3465   3      0.2%
3464   3      0.2%
3460   3      0.2%
3459   3      0.2%
3458   3      0.2%
3384   3      0.2%
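
Each frequency table pairs a value with its count and its share of the 1991 observations; for example, station 1861 appears 129 times, and 129 / 1991 is about 6.5%. The sketch below builds such a table with the standard library. `frequency_table` is an illustrative helper, not the report generator's own code.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the `top` most common values,
    followed by an 'Other values (k)' row when more distinct values remain."""
    counts = Counter(values)
    n = len(values)
    rows = [(value, count, 100 * count / n)
            for value, count in counts.most_common(top)]
    remaining_distinct = len(counts) - len(rows)
    if remaining_distinct > 0:
        other = n - sum(count for _, count, _ in rows)
        rows.append((f"Other values ({remaining_distinct})", other, 100 * other / n))
    return rows


# Hypothetical toy data, only to show the shape of the result.
for value, count, percent in frequency_table(["a", "a", "a", "b", "b", "c"], top=2):
    print(f"{value}\t{count}\t{percent:.1f}%")
```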

do
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 165
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.397422401
Minimum: 0
Maximum: 11.4
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3.9
Q1: 5.95
Median: 6.7
Q3: 7.2
95-th percentile: 8
Maximum: 11.4
Range: 11.4
Interquartile range (IQR): 1.25

Descriptive statistics

Standard deviation: 1.323062347
Coefficient of variation (CV): 0.2068117851
Kurtosis: 3.847240171
Mean: 6.397422401
Median Absolute Deviation (MAD): 0.6
Skewness: -1.468481491
Sum: 12737.268
Variance: 1.750493974
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
6.7                 131    6.6%
6.8                 118    5.9%
6.9                 103    5.2%
7                   98     4.9%
6.6                 91     4.6%
7.2                 87     4.4%
7.1                 82     4.1%
7.3                 79     4.0%
6.5                 64     3.2%
6.4                 63     3.2%
Other values (155)  1075   54.0%

Smallest values
Value  Count  Frequency (%)
0      1      0.1%
0.2    1      0.1%
0.5    1      0.1%
0.6    4      0.2%
0.7    2      0.1%
0.8    2      0.1%
0.9    2      0.1%
1      2      0.1%
1.1    1      0.1%
1.2    1      0.1%

Largest values
Value  Count  Frequency (%)
11.4   1      0.1%
11.1   1      0.1%
10     2      0.1%
9.9    2      0.1%
9.8    1      0.1%
9.6    2      0.1%
9.4    1      0.1%
9.3    1      0.1%
9.2    1      0.1%
9.1    1      0.1%

ph
Real number (ℝ≥0)

SKEWED

Distinct: 265
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111.6696168
Minimum: 0
Maximum: 67115
Zeros: 2
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6.4
Q1: 6.9
Median: 7.3
Q3: 7.7
95-th percentile: 8.4
Maximum: 67115
Range: 67115
Interquartile range (IQR): 0.8

Descriptive statistics

Standard deviation: 1875.161891
Coefficient of variation (CV): 16.79205092
Kurtosis: 876.0255587
Mean: 111.6696168
Median Absolute Deviation (MAD): 0.4
Skewness: 27.34385201
Sum: 222334.207
Variance: 3516232.118
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
7.2                 138    6.9%
7.3                 134    6.7%
7.4                 127    6.4%
7.1                 118    5.9%
7                   112    5.6%
7.6                 110    5.5%
6.9                 102    5.1%
7.7                 97     4.9%
7.8                 96     4.8%
7.5                 91     4.6%
Other values (255)  866    43.5%

Smallest values
Value  Count  Frequency (%)
0      2      0.1%
2.6    1      0.1%
2.7    2      0.1%
2.9    2      0.1%
3      1      0.1%
3.05   1      0.1%
3.1    2      0.1%
3.2    1      0.1%
3.3    1      0.1%
5.27   1      0.1%

Largest values
Value  Count  Frequency (%)
67115  1      0.1%
28598  1      0.1%
24336  1      0.1%
21331  1      0.1%
20850  1      0.1%
9948   1      0.1%
9416   1      0.1%
3384   1      0.1%
1835   1      0.1%
1708   1      0.1%

co
Real number (ℝ≥0)

Distinct: 1004
Distinct (%): 50.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1766.332461
Minimum: 0.4
Maximum: 65700
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0.4
5-th percentile: 31.5
Q1: 79
Median: 183
Q3: 568.5
95-th percentile: 12725.5
Maximum: 65700
Range: 65699.6
Interquartile range (IQR): 489.5

Descriptive statistics

Standard deviation: 5520.179564
Coefficient of variation (CV): 3.12522115
Kurtosis: 31.28977432
Mean: 1766.332461
Median Absolute Deviation (MAD): 125
Skewness: 5.04661423
Sum: 3516767.93
Variance: 30472382.42
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
183                 30     1.5%
62                  15     0.8%
55                  15     0.8%
59                  15     0.8%
67                  14     0.7%
61                  14     0.7%
49                  13     0.7%
52                  12     0.6%
65                  12     0.6%
80                  12     0.6%
Other values (994)  1839   92.4%

Smallest values
Value  Count  Frequency (%)
0.4    1      0.1%
3.6    1      0.1%
3.7    2      0.1%
4      1      0.1%
4.5    1      0.1%
4.6    1      0.1%
4.8    2      0.1%
5      2      0.1%
5.4    1      0.1%
5.6    1      0.1%

Largest values
Value  Count  Frequency (%)
65700  1      0.1%
48500  1      0.1%
47156  1      0.1%
46170  1      0.1%
44600  1      0.1%
44000  1      0.1%
43983  1      0.1%
42354  1      0.1%
39503  1      0.1%
37227  1      0.1%

bod
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 407
Distinct (%): 20.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.8311223
Minimum: 0.1
Maximum: 534.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.6
Q1: 1.2
Median: 1.8965
Q3: 3.6
95-th percentile: 22.15
Maximum: 534.5
Range: 534.4
Interquartile range (IQR): 2.4

Descriptive statistics

Standard deviation: 29.08989794
Coefficient of variation (CV): 4.258436119
Kurtosis: 181.0025391
Mean: 6.8311223
Median Absolute Deviation (MAD): 0.8965
Skewness: 12.3976801
Sum: 13600.7645
Variance: 846.2221622
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
1.5                 77     3.9%
1                   74     3.7%
1.2                 72     3.6%
1.1                 69     3.5%
1.6                 65     3.3%
1.4                 63     3.2%
0.9                 63     3.2%
1.3                 62     3.1%
0.8                 57     2.9%
1.9                 55     2.8%
Other values (397)  1334   67.0%

Smallest values
Value  Count  Frequency (%)
0.1    1      0.1%
0.25   1      0.1%
0.267  1      0.1%
0.28   1      0.1%
0.3    5      0.3%
0.4    19     1.0%
0.414  1      0.1%
0.425  1      0.1%
0.458  1      0.1%
0.467  1      0.1%

Largest values
Value    Count  Frequency (%)
534.5    1      0.1%
513.5    1      0.1%
441.8    1      0.1%
431.5    1      0.1%
359      1      0.1%
354      1      0.1%
341.833  1      0.1%
195.4    1      0.1%
185.8    1      0.1%
185      1      0.1%

na
Real number (ℝ≥0)

ZEROS

Distinct: 506
Distinct (%): 25.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.497969362
Minimum: 0
Maximum: 108.7
Zeros: 41
Zeros (%): 2.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.0815
Q1: 0.28
Median: 0.516
Q3: 1.2
95-th percentile: 5.08
Maximum: 108.7
Range: 108.7
Interquartile range (IQR): 0.92

Descriptive statistics

Standard deviation: 3.868221118
Coefficient of variation (CV): 2.582309903
Kurtosis: 323.9800913
Mean: 1.497969362
Median Absolute Deviation (MAD): 0.316
Skewness: 13.85574975
Sum: 2982.457
Variance: 14.96313462
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
0.516               225    11.3%
0.1                 77     3.9%
0.4                 55     2.8%
1                   51     2.6%
0.2                 50     2.5%
0.3                 44     2.2%
0                   41     2.1%
0.5                 29     1.5%
0.6                 23     1.2%
0.08                21     1.1%
Other values (496)  1375   69.1%

Smallest values
Value  Count  Frequency (%)
0      41     2.1%
0.01   1      0.1%
0.02   4      0.2%
0.03   3      0.2%
0.04   2      0.1%
0.05   7      0.4%
0.06   14     0.7%
0.07   7      0.4%
0.08   21     1.1%
0.083  1      0.1%

Largest values
Value  Count  Frequency (%)
108.7  1      0.1%
58.1   1      0.1%
25.71  1      0.1%
20.45  1      0.1%
20.3   1      0.1%
20.2   1      0.1%
19.69  1      0.1%
19.6   1      0.1%
19.4   2      0.1%
19.35  1      0.1%

tc
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1093
Distinct (%): 54.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 498335.6188
Minimum: 0
Maximum: 511090873
Zeros: 5
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 0
5-th percentile: 22
Q1: 118
Median: 468
Q3: 1696.5
95-th percentile: 38547.5
Maximum: 511090873
Range: 511090873
Interquartile range (IQR): 1578.5

Descriptive statistics

Standard deviation: 13754731.42
Coefficient of variation (CV): 27.60134115
Kurtosis: 1076.580903
Mean: 498335.6188
Median Absolute Deviation (MAD): 409
Skewness: 31.71407152
Sum: 992186217
Variance: 1.891926364 × 10^14
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value                Count  Frequency (%)
468                  133    6.7%
10                   14     0.7%
33                   12     0.6%
45                   12     0.6%
63                   11     0.6%
36                   10     0.5%
41                   9      0.5%
32                   9      0.5%
22                   9      0.5%
350                  9      0.5%
Other values (1083)  1763   88.5%

Smallest values
Value  Count  Frequency (%)
0      5      0.3%
2      1      0.1%
3      4      0.2%
4      3      0.2%
5      7      0.4%
6      7      0.4%
7      2      0.1%
8      6      0.3%
9      3      0.2%
10     14     0.7%

Largest values
Value      Count  Frequency (%)
511090873  1      0.1%
300000000  1      0.1%
160405392  1      0.1%
6400000    1      0.1%
967500     1      0.1%
712500     1      0.1%
622500     1      0.1%
440750     1      0.1%
392500     1      0.1%
358000     1      0.1%

wqi
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 257
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.03411351
Minimum: 19.3
Maximum: 99.8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.7 KiB

Quantile statistics

Minimum: 19.3
5-th percentile: 50.2
Q1: 71.74
Median: 79.68
Q3: 87.66
95-th percentile: 93.64
Maximum: 99.8
Range: 80.5
Interquartile range (IQR): 15.92

Descriptive statistics

Standard deviation: 12.85271857
Coefficient of variation (CV): 0.1668445054
Kurtosis: 1.824096664
Mean: 77.03411351
Median Absolute Deviation (MAD): 7.94
Skewness: -1.286896152
Sum: 153374.92
Variance: 165.1923746
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
Most frequent values
Value               Count  Frequency (%)
87.66               90     4.5%
88.38               82     4.1%
82.94               76     3.8%
82.04               67     3.4%
83.7                65     3.3%
66.44               59     3.0%
88.2                57     2.9%
82.76               50     2.5%
88.56               45     2.3%
94.18               44     2.2%
Other values (247)  1356   68.1%

Smallest values
Value  Count  Frequency (%)
19.3   1      0.1%
21.52  1      0.1%
23.44  1      0.1%
28.12  1      0.1%
28.66  1      0.1%
28.66  1      0.1%
30.04  2      0.1%
30.54  1      0.1%
32.78  1      0.1%
33.34  15     0.8%

Largest values
Value  Count  Frequency (%)
99.8   1      0.1%
99.62  3      0.2%
99.44  3      0.2%
98.9   1      0.1%
94.76  1      0.1%
94.22  4      0.2%
94.18  44     2.2%
94     19     1.0%
93.82  23     1.2%
93.64  12     0.6%

Interactions

[Figures: 81 pairwise interaction plots, one per combination of the 9 variables (9 × 9)]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
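
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. Giving tied values the average of their ranks is an assumption (it is the usual convention), and the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relation: rho is (numerically) 1, Pearson's r is not.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
```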

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
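
A minimal sketch of that formula, together with a check of the invariance property mentioned above. The sample numbers are made up for illustration, and the invariance is shown for positive scale factors only, since a negative factor flips the sign of r.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)


# Hypothetical, roughly linear data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)

# Shift and rescale each variable separately: r stays the same.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```

The divisor (n or n - 1) cancels between the covariance and the two standard deviations, so either convention gives the same r.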

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
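
The definition above corresponds to the tau-a variant, sketched below by brute force over all pairs. Libraries such as pandas and SciPy typically report tau-b, which adjusts the denominator for ties, so values can differ when the data contain ties. `kendall_tau_a` is an illustrative name.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs, one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This pairwise loop is O(n²), which is fine for a sketch but slow for large columns.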

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

   Unnamed: 0  station  do   ph   co     bod     na   tc      wqi
0  0           1393.0   6.7  7.5  203.0  1.8965  0.1  27.0    93.82
1  1           1399.0   5.7  7.2  189.0  2.0000  0.2  8391.0  76.96
2  2           1475.0   6.3  6.9  179.0  1.7000  0.1  5330.0  79.28
3  3           3181.0   5.8  6.9  64.0   3.8000  0.5  8443.0  69.34
4  4           3182.0   5.8  7.3  83.0   1.9000  0.4  5500.0  77.14
5  5           1400.0   5.5  7.4  81.0   1.5000  0.1  4049.0  77.14
6  6           1476.0   6.1  6.7  308.0  1.4000  0.3  5672.0  75.44
7  7           3185.0   6.4  6.7  414.0  1.0000  0.2  9423.0  75.44
8  8           3186.0   6.4  7.6  305.0  2.2000  0.1  4990.0  82.04
9  9           3187.0   6.3  7.6  77.0   2.3000  0.1  4301.0  82.76

Last rows

      Unnamed: 0  station  do   ph     co   bod  na     tc     wqi
1981  1981        1160.0   7.3  178.0  6.7  1.5  0.138  190.0  72.06
1982  1982        1161.0   7.1  214.0  6.8  2.3  0.585  350.0  72.06
1983  1983        1162.0   7.5  293.0  7.2  1.2  0.568  35.0   77.68
1984  1984        1328.0   6.9  146.0  7.1  2.0  0.506  38.0   77.68
1985  1985        1329.0   7.0  136.0  7.5  1.4  0.609  205.0  72.06
1986  1986        1330.0   7.9  738.0  7.2  2.7  0.518  202.0  72.06
1987  1987        1450.0   7.5  585.0  6.3  2.6  0.155  315.0  72.06
1988  1988        1403.0   7.6  98.0   6.2  1.2  0.516  570.0  66.44
1989  1989        1404.0   7.7  91.0   6.5  1.3  0.516  562.0  66.44
1990  1990        1726.0   7.6  110.0  5.7  1.1  0.516  546.0  66.44